ABSTRACT

We study the problem of scheduling tasks in a distributed system where the data (and code) for a task may reside on a processor different from the one on which the task will be executed. Scheduling in this setting is more complex than in classical scheduling problems, since one must take into account not only processing times but also communication times. We present an off-line polynomial-time approximation algorithm for the case in which the processors can be partitioned into storage (client) and processing (server) nodes. Our algorithm is the first constant-ratio approximation algorithm for this problem. We then discuss generalizations of the problem, including an on-line distributed version, as well as versions that allow tasks to access multiple input files and generate multiple output files residing on one or more nodes.

Keywords: Approximation Algorithms, Dual Objective Functions, Minimize Makespan, Scheduling.